A comparison of acoustic features for articulatory inversion
Abstract
We empirically study the best acoustic parameterization for articulatory inversion (the problem of recovering the sequence of vocal tract shapes that produce a given acoustic speech signal). We compare all combinations of the following factors: 1) popular acoustic features such as MFCC and PLP, with and without dynamic features; 2) different short-time window lengths; 3) different levels of smoothing of the acoustic temporal trajectories. Experimental results on a real speech production database show consistent improvement when using features closely related to the vocal tract (in particular LSF), dynamic features, and a large window length and smoothing (both of which reduce the jaggedness of the acoustic trajectory). Further improvements are obtained with a 15 ms time delay between acoustic and articulatory frames. However, the improvement attained over other combinations is very small (at most 0.3 mm RMSE).
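The factors compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the moving-average smoother, the central-difference dynamic feature, and the direction of the frame delay are all assumptions chosen for illustration, and the input is a single 1-D feature trajectory rather than a full feature matrix.

```python
import math


def moving_average(track, width):
    """Smooth a 1-D trajectory with a centered window (assumed smoother).

    The window is truncated at the edges, so the output keeps the input length.
    """
    half = width // 2
    out = []
    for t in range(len(track)):
        window = track[max(0, t - half): t + half + 1]
        out.append(sum(window) / len(window))
    return out


def delta(track):
    """Central-difference dynamic feature (assumed definition).

    The first and last frames are repeated as padding.
    """
    padded = [track[0]] + list(track) + [track[-1]]
    return [(padded[t + 2] - padded[t]) / 2.0 for t in range(len(track))]


def align_with_delay(acoustic, articulatory, delay_frames):
    """Pair acoustic frame t with articulatory frame t + delay_frames.

    The direction of the shift is an assumption; the abstract only states
    that a delay between the two streams helps.
    """
    n = len(acoustic) - delay_frames
    return list(zip(acoustic[:n], articulatory[delay_frames:]))


def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences."""
    total = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(total / len(predicted))
```

In such a setup, `delay_frames` would be the 15 ms delay divided by the frame shift, which the abstract does not state, and `rmse` would be evaluated per articulator coordinate in millimetres.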
Similar resources
A New Bidirectional Neural Network Model for the Acoustic-Articulatory Inversion Mapping for Speech Recognition
In this paper, a new bidirectional neural network for better acoustic-articulatory inversion mapping is proposed. The model is motivated by the parallel structure of the human brain, which processes information through forward and inverse connections. In other words, there would be feedback from the articulatory system to the acoustic signals emitted from that organ. Inspired by this mechanism, a new bidire...
Automatic speech recognition using articulatory features from subject-independent acoustic-to-articulatory inversion
An automatic speech recognition approach is presented which uses articulatory features estimated by a subject-independent acoustic-to-articulatory inversion. The inversion allows estimation of articulatory features from any talker's speech acoustics using only an exemplary subject's articulatory-to-acoustic map. Results are reported on a broad class phonetic classification experiment on speech ...
Information theoretic acoustic feature selection for acoustic-to-articulatory inversion
We use mutual information as the criterion to rank the Mel frequency cepstral coefficients (MFCCs) and their derivatives according to the information they provide about different articulatory features in acoustic-to-articulatory (AtoA) inversion. It is found that just a small subset of the coefficients encodes maximal information about articulatory features and interestingly, this subset is art...
Audiovisual-to-articulatory inversion
It has been shown that acoustic-to-articulatory inversion, i.e. estimation of the articulatory configuration from the corresponding acoustic signal, can be greatly improved by adding visual features extracted from the speaker’s face. In order to make the inversion method usable in a realistic application, these features should be possible to obtain from a monocular frontal face video, where the...
Palate-referenced articulatory features for acoustic-to-articulator inversion
The selection of effective articulatory features is an important component of tasks such as acoustic-to-articulator inversion and articulatory synthesis. Although it is common to use direct articulatory sensor measurements as feature variables, this approach fails to incorporate important physiological information such as palate height and shape and thus is not as representative of vocal tract ...
Publication date: 2007